Rescheduling for Locality in Sparse Matrix Computations
Authors
Abstract
In modern computer architectures, the use of memory hierarchies causes a program's data locality to directly affect performance. Data locality occurs when a piece of data is still in a cache upon reuse. For dense matrix computations, loop transformations can be used to improve data locality. However, sparse matrix computations have non-affine loop bounds and indirect memory references, which prohibit the use of compile-time loop transformations. This paper describes an algorithm for tiling at runtime called serial sparse tiling. We test a runtime-tiled version of sparse Gauss-Seidel on 4 different architectures, where it exhibits speedups of up to 2.7. The paper also gives a static model for determining tile size and outlines how overhead affects the overall speedup.
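To make the abstract's kernel concrete, the sketch below shows one Gauss-Seidel sweep over a matrix stored in CSR (compressed sparse row) form. This is an illustrative sketch only, not the paper's serial sparse tiling implementation; the function name and CSR layout are assumptions. The indirect reference `x[c]` through the `cols` array is exactly the access pattern that defeats compile-time loop transformations.

```python
def gauss_seidel_sweep(vals, cols, rowptr, b, x):
    """One Gauss-Seidel sweep for A x = b, with A in CSR form
    (vals: nonzeros, cols: column indices, rowptr: row offsets).
    Updates x in place using the latest values of the other unknowns."""
    n = len(b)
    for i in range(n):
        diag = 0.0
        acc = b[i]
        for j in range(rowptr[i], rowptr[i + 1]):
            c = cols[j]
            if c == i:
                diag = vals[j]          # remember the diagonal entry
            else:
                acc -= vals[j] * x[c]   # indirect reference: pattern known only at runtime
        x[i] = acc / diag
    return x

# Example: the 2x2 system [[4, 1], [1, 3]] x = [1, 2] in CSR form.
vals, cols, rowptr = [4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4]
x = gauss_seidel_sweep(vals, cols, rowptr, [1.0, 2.0], [0.0, 0.0])
# After one sweep: x[0] = 0.25, x[1] = 1.75 / 3
```

Because each row's column indices are only known once the matrix is assembled, any tiling of these sweeps for cache reuse must, as the paper argues, happen at runtime.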
Similar Resources
ICCS 2001 talk: Rescheduling for Locality in Sparse Matrix Computations
Multigrid solves a set of simultaneous linear equations optimally: the time and space used by the multigrid computation are proportional to N, the number of unknowns. In reality, the performance of any computation is not dictated solely by how many operations are performed. The order of the operations and the layout of data in memory determine how well the levels of cache in the target architectu...
Using Sparse Tiling with Symmetric Multigrid
Good data locality is an important aspect of obtaining scalable performance for multigrid methods. However, locality can be difficult to achieve, especially when working with unstructured grids and sparse matrices whose structure is not known until runtime. Our previous work developed full sparse tiling, a runtime reordering and rescheduling technique for improving locality. We applied full spa...
Algorithms + Data Structures + Transformations = Portable Program Performance
Many scientific applications require sparse matrix computations, for example Finite Element modeling and N-body simulations. It is difficult to write these codes in a portable way that also achieves high performance, because of the sparsity of the matrices and because current architectures have deep memory hierarchies and multiple levels of parallelism. Therefore the implementation of such compu...
Technical Report BU-CE-1201: Hypergraph-Partitioning-Based Models and Methods for Exploiting Cache Locality in Sparse-Matrix Vector Multiplication
The sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear solvers. The same sparse matrix is multiplied by a dense vector repeatedly in these solvers. Matrices with irregular sparsity patterns make it difficult to utilize cache locality effectively in SpMxV computations. In this work, we investigate single- and multiple-SpMxV frameworks for exploiting c...
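The SpMxV kernel that this abstract analyzes can be sketched in a few lines over the same CSR layout. This is a generic illustration of the kernel, not the report's hypergraph-partitioning method; the function name is an assumption. The gather through `cols[j]` is the irregular access that makes cache locality hard to exploit.

```python
def spmv_csr(vals, cols, rowptr, x):
    """Compute y = A @ x for A stored in CSR form. The irregular
    gather x[cols[j]] is the source of the cache-locality problem
    that reordering and partitioning techniques target."""
    n = len(rowptr) - 1
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        for j in range(rowptr[i], rowptr[i + 1]):
            s += vals[j] * x[cols[j]]   # indirect, data-dependent access
        y[i] = s
    return y

# Example: [[4, 1], [1, 3]] @ [1, 2] -> [6, 7]
y = spmv_csr([4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4], [1.0, 2.0])
```

Because iterative solvers apply the same matrix repeatedly, reordering its rows and columns once at runtime to improve the locality of these gathers can pay off over many solver iterations.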
Journal:
Volume, Issue:
Pages: -
Publication date: 2001